9 research outputs found
A functional approach to estimation of the parameters of generalized negative binomial and gamma distributions
The generalized negative binomial distribution (GNB) is a new flexible family
of discrete distributions that are mixed Poisson laws with the mixing
generalized gamma (GG) distributions. This family of discrete distributions is
very wide and embraces Poisson distributions, negative binomial distributions,
Sichel distributions, Weibull--Poisson distributions and many other types of
distributions supplying descriptive statistics with many flexible models. These
distributions seem to be very promising for the statistical description of many
real phenomena. GG distributions are widely applied in signal and image
processing and other practical problems. The statistical estimation of the
parameters of GNB and GG distributions is quite complicated. To find estimates,
the methods of moments or maximum likelihood can be used as well as two-stage
grid EM-algorithms. The paper presents a methodology based on the search for
the best distribution using the minimization of $\ell^p$-distances and
$L^p$-metrics for GNB and GG distributions, respectively. This approach, first,
makes it possible to obtain parameter estimates without using grid methods or
solving systems of nonlinear equations and, second, yields not point estimates as the
methods of moments or maximum likelihood do, but the estimate for the density
function. In other words, within this approach the set of decisions is not a
Euclidean space, but a functional space.
Comment: 13 pages, 6 figures, The XXI International Conference on Distributed
Computer and Communication Networks: Control, Computation, Communications
(DCCN 2018)
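The minimum-distance idea in the abstract above can be illustrated on the simplest member of the family it names, the Poisson distribution. The sketch below is not the authors' procedure: it assumes an l^p distance between the empirical and model probability mass functions, a truncated support, and a golden-section search over a bracket on which the objective is assumed to be unimodal. All function names are hypothetical.

```python
import math
from collections import Counter


def poisson_pmf(k, lam):
    # Computed in log space to stay stable for larger k.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def lp_distance(sample, lam, p=2.0):
    """l^p distance between the empirical pmf of `sample` and Poisson(lam)."""
    n = len(sample)
    freq = Counter(sample)
    k_max = max(sample) + 50  # assumption: truncate the infinite support here
    total = 0.0
    for k in range(k_max + 1):
        total += abs(freq.get(k, 0) / n - poisson_pmf(k, lam)) ** p
    return total ** (1.0 / p)


def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimize f on [lo, hi]; assumes f is unimodal on that bracket."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; reuse c as the new upper probe.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; reuse d as the new lower probe.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2.0


def fit_poisson_min_distance(sample, p=2.0):
    """Return the rate that minimizes the l^p distance to the empirical pmf."""
    upper = 2.0 * max(sample) + 1.0  # assumption: a generous search bracket
    return golden_section_min(lambda lam: lp_distance(sample, lam, p), 1e-6, upper)
```

The search needs neither a parameter grid nor the solution of likelihood equations, which is the property the abstract emphasizes; extending it to the multi-parameter GNB or GG case would require a multivariate optimizer.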
Robust detection of phone segments in continuous speech using model selection criteria
Automatic phone segmentation techniques based on model selection criteria are studied. We investigate the phone boundary detection efficiency of entropy- and Bayesian-based model selection criteria in continuous speech based on the DISTBIC hybrid segmentation algorithm. DISTBIC is a text-independent bottom-up approach that identifies sequential model changes by combining metric distances with statistical hypothesis testing. Using robust statistics and small sample corrections in the baseline DISTBIC algorithm, phone boundary detection accuracy is significantly improved, while false alarms are reduced. We also demonstrate further improvement in phonemic segmentation by taking into account how the model parameters are related in the probability density functions of the underlying hypotheses as well as in the model selection via the information complexity criterion and by employing M-estimators of the model parameters. The proposed DISTBIC variants are tested on the NTIMIT database and the achieved measure is 74.7% using a 20-ms tolerance in phonemic segmentation. © 2009 IEEE
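The hypothesis-testing step that BIC-based segmentation relies on can be sketched for a one-dimensional feature sequence. This is a simplified illustration, not the DISTBIC variants described above: it assumes univariate Gaussian segments, the plain BIC penalty with a hypothetical weight `penalty`, and it omits the distance-based first pass, the robust estimators, and the small-sample corrections.

```python
import math


def _variance(segment):
    mean = sum(segment) / len(segment)
    # Floor the variance so the logarithm stays finite for constant segments.
    return max(sum((v - mean) ** 2 for v in segment) / len(segment), 1e-12)


def delta_bic(x, t, penalty=1.0):
    """BIC gain of splitting x at index t versus modelling x as one Gaussian.

    A positive value favours the two-segment hypothesis (a boundary at t).
    """
    n = len(x)
    left, right = x[:t], x[t:]
    gain = 0.5 * (
        n * math.log(_variance(x))
        - len(left) * math.log(_variance(left))
        - len(right) * math.log(_variance(right))
    )
    extra_params = 2  # one extra mean and one extra variance
    return gain - 0.5 * penalty * extra_params * math.log(n)


def find_boundary(x, margin=5, penalty=1.0):
    """Return (index, score) of the best boundary, or None if none is accepted."""
    best = None
    for t in range(margin, len(x) - margin + 1):
        score = delta_bic(x, t, penalty)
        if best is None or score > best[1]:
            best = (t, score)
    if best is None or best[1] <= 0.0:
        return None
    return best
```

In a full system the candidate indices would come from the distance-based pass rather than from an exhaustive scan, and the feature vectors would be multivariate.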
Focused Crawling through Reinforcement Learning
Focused crawling aims at collecting as many Web pages relevant to a target topic as possible while avoiding irrelevant pages, reflecting limited resources available to a Web crawler. We improve on the efficiency of focused crawling by proposing an approach based on reinforcement learning. Our algorithm evaluates hyperlinks most profitable to follow over the long run, and selects the most promising link based on this estimation. To properly model the crawling environment as a Markov decision process, we propose new representations of states and actions considering both content information and the link structure. The size of the state-action space is reduced by a generalization process. Based on this generalization, we use a linear-function approximation to update value functions. We investigate the trade-off between synchronous and asynchronous methods. In experiments, we compare the performance of a crawling task with and without learning; crawlers based on reinforcement learning show better performance for various target topics.
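The value update with linear function approximation mentioned in the abstract above can be sketched as a Q-learning step over link features. This is a minimal illustration under stated assumptions, not the paper's state and action representation: the two-component feature vectors, the reward, and the function names are all hypothetical.

```python
def q_value(weights, features):
    """Linear estimate of the long-run value of following a link."""
    return sum(w * x for w, x in zip(weights, features))


def td_update(weights, features, reward, next_features, alpha=0.1, gamma=0.9):
    """One Q-learning step with linear function approximation.

    `features` describes the link just followed; `next_features` lists the
    feature vectors of the links available afterwards (may be empty).
    """
    best_next = max((q_value(weights, f) for f in next_features), default=0.0)
    td_error = reward + gamma * best_next - q_value(weights, features)
    return [w + alpha * td_error * x for w, x in zip(weights, features)]


def select_link(weights, frontier):
    """Pick the frontier URL whose features have the highest estimated value."""
    return max(frontier, key=lambda url: q_value(weights, frontier[url]))
```

Here `frontier` is assumed to map each candidate URL to its feature vector, for example a content-similarity score and a link-structure score. Updating the weights after every fetched page corresponds to an asynchronous scheme, while batching updates would correspond to a synchronous one.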